The coronavirus pandemic has affected not only health but also the economy. 
Information mined from big data can help logistics companies remain profitable 
and survive during the pandemic. This study 
conducted text-mining research on a service consultant site in the logistics sector. 
This study aims to present frequency diagrams, analyze sentiment using the 
National Research Council (NRC) lexicon, present bigrams, and seek knowledge 
about strategies to minimize shipping costs and maintain inventories of 
manufactured goods. The words "supply", "chain", and "COVID-19" are used 
frequently throughout the articles. The results of this study show that the 
words that appear most often in the mined text are "supply", "chain", 
"logistics", "kpis", and "inventory". Trust is the emotion that appears most 
often in the articles. "Supply" and "pandemic" are the most frequent positive 
and negative words, respectively. The terms "COVID-19", "safety stock", and 
"inventory management" often appear together. The knowledge discovered is that 
logistics consultants convey trust and provide many insights on minimizing 
shipping costs and maintaining inventory during a pandemic. 
1. INTRODUCTION 


In recent years, the coronavirus disease, commonly called COVID-19, has infected millions of people 
and resulted in more than 6 million deaths worldwide. COVID-19 has had an impact not only on health but also 
on the economy. Restrictions on social activities and lockdowns significantly affect supply chain (SC) 
flow at every stage [1]. Government restrictions on activities increase shipping costs, and production stocks are 
also depleted due to restrictions on work activities. Entrepreneurs in the logistics sector must find 
ways to survive during the pandemic. 

Logistics Bureau is a private company in SC management and logistics consulting. The company is 
experienced in the field of end-to-end SC. It was founded in 1997 and has grown into several 
companies that focus on consulting, education, and benchmarking in SC and logistics. To date, the company has 
branches in 25 countries. 

Logistics consultants play a role in helping companies in the SC and logistics sector make the 
right decisions. Logistics Bureau has a website, www.logisticsbureau.com. The site hosts many articles, 
especially on SC strategies during a pandemic. Searching for knowledge on logistics consultant websites 
can help companies quickly gain useful knowledge. What knowledge can be gained from 
extracting data from the site? What sentiments do the authors often express in articles on the website? Moreover, 
what SC strategies are used to reduce shipping costs and overcome product shortages during a pandemic? 
With this background, this paper aims to discover knowledge from the Logistics Bureau website during the 
recent pandemic. This topic is theoretically essential to understanding how logistics consultants suggest 
strategies and what sentiment words often appear in their articles during a pandemic. 

This investigation is important for understanding the SC strategy implications of this phenomenon. 
This research therefore explores the insights that can be 
obtained from articles on the www.logisticsbureau.com website through machine learning. The methods 
used in this study include text mining, cleaning text, analyzing text using RStudio, and searching for 
knowledge by reading articles. 

In analyzing the text, we present the words that occur frequently in a frequency chart and then 
in the form of a word cloud. Emotional sentiment theory and a lexicon form the basis of the analysis. 
Web scraping through machine learning (RStudio) was carried out to collect 16 article posts from the site, 
published from 2019 to 2022, containing SC strategy and COVID-19. Lexicon-based sentiment analysis was used to 
evaluate the sentiment scores of the sample articles. After analyzing the frequency, we seek knowledge 
by reading the articles to identify strategies to minimize container shipping costs and maintain stock availability. 

In recent years, text mining on websites and social media has been carried out by many researchers, 
including [2]-[4]. Text mining in the SC domain has also been discussed in [5]-[7]. Much research has been done 
on SC and its links with COVID-19 throughout 2019-2022 [8]. Strategies to minimize risk are discussed 
by Trautrims et al. [9], and Zuhanda et al. [10] present the 2E-VRP model to minimize logistics shipping costs. 
Goncalves et al. [11] discuss the handling of stock availability. 

There have been many studies on SC strategies during the COVID-19 pandemic. However, there is still 
little research on sentiment analysis (SA) of the SC during the pandemic. Several studies related to SA are 
shown in Table 1. Most still analyze sentiment on social media, notably Twitter, and few examine articles on 
sites run by experts in SC and logistics. The opinions of experts in the SC field will provide a new perspective 
and knowledge about SC strategy during the pandemic. 


Table 1. Summary of related work 


Num.  Ref.                        Source text                Method 
1     Akundi et al. [12]          Twitter                    SA and opinion analysis 
2     Shipley et al. [13]         Blog, forum, and Twitter   SA 
3     Sperry et al. [14]          Twitter, survey, forum     SA 
4     Treiblmaier and Mair [15]   Professional interviews    Word clouds, SA, topic models, correspondence analysis, and multidimensional scaling 
5     Khatua et al. [16]          Twitter and Facebook       SA 


There are still relatively few studies that analyze word relationships through bigrams in 
text mining; related publications that explore bigrams are [17], [18]. To obtain critical knowledge from the articles' 
content, it is necessary to read the articles related to the research questions. The 
answers to the research questions are then presented in a Sankey diagram. Some publications that use 
Sankey diagrams include [19], [20]. The remainder of this paper is structured as follows. 
Section 2 describes the research methods. Section 3 presents the results and discussion. Conclusions are presented in 
section 4. 


2. METHOD 
2.1. Text mining 

In this study, data collection was carried out through web scraping on the www.logisticsbureau.com 
site with the keywords "supply", "chain", and "COVID-19". The purpose of extracting this text is to obtain text 
data that will later be analyzed using machine learning to gain insight from the articles on the website. 
The title and content of each article were collected and arranged into a table, which was stored in .csv 
format. To prepare for extracting text from a website, install the rvest 
and dplyr packages in RStudio. Then install SelectorGadget in Chrome; SelectorGadget is an open-source tool 
that helps create and find CSS selectors on sites. Several studies that used web scraping for data mining 
include [21]-[23]. 
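
The collection step can be illustrated with a minimal sketch. The study itself used rvest and dplyr in RStudio; the Python below is only a stand-in for the same idea and is not the authors' code. It parses an inline HTML sample rather than the live site, and the class names "entry-title" and "entry-content" are assumptions standing in for whatever CSS selectors SelectorGadget would report.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page markup: the class names below are assumptions, not the
# real structure of the website analyzed in the study.
SAMPLE_HTML = """
<html><body>
  <h1 class="entry-title">Sample article title</h1>
  <div class="entry-content"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>
"""

# Tags that never have a closing tag and must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ArticleParser(HTMLParser):
    """Collects the text inside the title element and the content element."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.content_parts = []
        self._target = None  # list currently receiving text, if any
        self._depth = 0      # nesting depth inside the target element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._target is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "entry-title" in classes:
            self._target, self._depth = self.title_parts, 1
        elif "entry-content" in classes:
            self._target, self._depth = self.content_parts, 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._target is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._target = None

    def handle_data(self, data):
        if self._target is not None and data.strip():
            self._target.append(data.strip())


def scrape_article(html):
    """Return one table row (title, content) for a single article page."""
    parser = ArticleParser()
    parser.feed(html)
    return {"title": " ".join(parser.title_parts),
            "content": " ".join(parser.content_parts)}


def to_csv(rows):
    """Serialize the collected rows in .csv format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "content"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


article = scrape_article(SAMPLE_HTML)
csv_text = to_csv([article])
print(csv_text)
```

In a real run, the HTML would come from each article page, with one row appended per article.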
2.2. Cleaning text 

After the article text has been collected, it must be cleaned before it can be analyzed. 
Cleaning the text, that is, pre-processing the data, involves removing whitespace characters such as "\n", 
removing punctuation, changing the text to lowercase, separating the text into words, and removing 
stopwords. Stopwords are words that can be ignored because omitting them does not affect the meaning and 
information in a sentence. In splitting the text, the text is converted into a list of separate 
strings. Eliminating stopwords makes the word sets from the articles better suited to analysis for valuable insights [24], [25]. 

To illustrate the cleaning process, we take one of the content texts 
extracted from the website: "\nKnown as the ferrymen of Wuhan, thousands of motorbike riders ...". 
Table 2 shows the stages of cleaning this sentence down to individual words. 
The result is then processed in the next stage for analysis using machine learning. 


Table 2. Cleaning process 


Num.  Step                                                       After cleaning 
1.    Remove "\n", punctuation, and URLs; change to lower case   "known as the ferrymen of wuhan thousands of motorbike riders ..." 
2.    Separate words                                             "known", "as", "the", "ferrymen", "of", "wuhan", "thousands", "of", "motorbike", "riders", ... 
3.    Remove stopwords                                           "known", "ferrymen", "wuhan", "thousands", "motorbike", "riders", ... 
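
The three stages in Table 2 can be sketched as follows. This is an illustrative Python version, not the R code used in the study, and the stopword set is a small assumed subset chosen only so that the example sentence can be processed; a real analysis would use a complete stopword list.

```python
import re
import string

# Assumed, deliberately tiny stopword subset for illustration only.
STOPWORDS = {"as", "the", "of", "a", "an", "and", "in", "to", "is"}


def clean_text(raw):
    """Step 1: drop newlines, URLs, and punctuation, then lowercase."""
    text = raw.replace("\n", " ")
    # URLs are removed before punctuation so the pattern still matches them.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.lower()


def tokenize(text):
    """Step 2: split the cleaned text into a list of separate words."""
    return text.split()


def remove_stopwords(tokens):
    """Step 3: keep only the words that are not stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


raw = "\nKnown as the ferrymen of Wuhan, thousands of motorbike riders ..."
tokens = tokenize(clean_text(raw))
kept = remove_stopwords(tokens)
print(tokens)
print(kept)
```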


2.3. Analyzing text via RStudio 

In the next stage, after the text is cleaned, word frequencies are counted [26]. 
After cleaning and separating the words, we can see the terms that often appear on the website. The 
most frequent words are then presented in a word frequency diagram and 
shown as a word cloud so that they are easy to grasp. The next step is to analyze the 
emotional score of the text extracted from the website articles. For this step, this study uses the National Research 
Council (NRC) lexicon to compute emotion scores from the text [27], [28]. 
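
As a minimal sketch of the frequency step, the counts behind a word frequency diagram can be obtained with a simple tally over the cleaned words. The token list below is invented for illustration and is not data from the study.

```python
from collections import Counter


def top_words(tokens, n=25):
    """Return the n most frequent words with their counts."""
    return Counter(tokens).most_common(n)


# Invented sample tokens, standing in for the cleaned article text.
sample_tokens = ["supply", "chain", "logistics", "supply", "chain",
                 "supply", "inventory"]
top = top_words(sample_tokens, n=3)

# A crude text rendering of the frequency diagram.
for word, count in top:
    print(f"{word:<12}{'#' * count}")
```

The same counts can feed a word cloud, where the drawn size of each word is scaled by its count.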

Next, we present the relationships between words in a bigram network visualization [29]. Bigram 
analysis extracts and counts pairs of words that appear together in the cleaned text. A bigram is a sequence of two 
adjacent words. This yields the related words that often occur together, which are presented in a diagram. 
The classification of emotions will be used to answer the question Q1: What types 
of emotions often arise from the articles presented on the website during the COVID-19 pandemic? 
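
Bigram counting can be sketched in a few lines. The example below is an assumption-laden illustration in Python: the tokens are invented, and the threshold `min_count` is a hypothetical parameter for deciding which pairs become edges in the network.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count each pair of adjacent words in the token sequence."""
    return Counter(zip(tokens, tokens[1:]))


def network_edges(counts, min_count=2):
    """Keep pairs seen at least min_count times as (word1, word2, weight)."""
    return [(a, b, n) for (a, b), n in counts.most_common() if n >= min_count]


# Invented sample tokens for illustration.
sample_tokens = ["safety", "stock", "inventory", "management",
                 "safety", "stock"]
counts = bigram_counts(sample_tokens)
edges = network_edges(counts)
print(edges)
```

In a bigram network, each edge weight would set the thickness of the line joining the two words. Pairs should be counted within one article at a time so that the last word of one article is not paired with the first word of the next.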


2.4. Theoretical fundamentals and creating research questions 

Word frequency analysis presents the words that are often used in a text. Useful information can be derived 
from the frequencies. For example, they reveal the keywords of the content on the 
website being analyzed. This provides an overview of the situation or the message the author wants to 
present, so the content of an article can be understood quickly without reading it thoroughly. 

Emotion analysis provides a basis for assessing the current state of affairs: is it good or bad? Several types 
of emotions influence decision-making. An author's choice of words reflects the message to 
be conveyed. Analysis of emotions classified as "sad", "happy", "believe", "fear", and others 
describes the current state of affairs. 

In recent years, the classification of these types of emotions has become possible with machine learning. There 
are several methods for classifying emotions, including lexicon analysis. Three lexicons 
can be used, namely Bing, AFINN, and NRC [30]. The Bing lexicon labels words as 
positive or negative. The AFINN lexicon labels words with a score between 
-5 and 5; a negative score represents negative sentiment and a positive score represents positive sentiment. The NRC 
lexicon classifies words into eight emotions and two sentiments [31]. The 
classification of emotions will be used to answer the question Q2: What types of emotions often arise 
from the articles presented on the website during the COVID-19 pandemic? 
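
A lexicon-based emotion score of the NRC kind amounts to looking up each word and incrementing every category it belongs to. The sketch below shows the mechanism only: the entries in `TOY_LEXICON` are invented for illustration and are not taken from the actual NRC lexicon, and the token list is likewise made up.

```python
# The eight emotions and two sentiments used by the NRC lexicon.
NRC_CATEGORIES = ["anger", "anticipation", "disgust", "fear", "joy",
                  "sadness", "surprise", "trust", "negative", "positive"]

# Hypothetical word-to-category entries, NOT real NRC lexicon data.
TOY_LEXICON = {
    "pandemic": {"fear", "sadness", "negative"},
    "reliable": {"trust", "positive"},
    "plan": {"anticipation"},
}


def emotion_scores(tokens, lexicon):
    """Count, per category, how many tokens the lexicon assigns to it."""
    scores = {category: 0 for category in NRC_CATEGORIES}
    for token in tokens:
        for category in lexicon.get(token, ()):
            scores[category] += 1
    return scores


# Invented tokens; "container" is absent from the toy lexicon and is skipped.
sample_tokens = ["reliable", "plan", "pandemic", "reliable", "container"]
scores = emotion_scores(sample_tokens, TOY_LEXICON)
print(scores)
```

A word can contribute to several categories at once, which is why the category totals need not add up to the number of words.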


3. RESULTS AND DISCUSSION 

What do we get from mining 16 articles on the website? This paper presents visualizations of 
the sample articles in several ways. This study uses two packages in RStudio, wordcloud and ggplot2, to visualize 
the data. The wordcloud package is used to build the word cloud, while ggplot2 is used to 
plot the top 25 most frequently used words from the 16 articles analyzed. 

The word frequency analysis of data collected from articles on the www.logisticsbureau.com site 
illustrates what the SC consulting site discusses and provides an opportunity for in-depth analysis of the 
words that appear most often. Figure 1 presents the frequencies of the 25 words that appear most 
frequently in the articles on the site. Here, we note that the words "supply", "chain", "logistics", "kpis", 
"inventory", "business", "shipping", "cost", and "time" have the highest frequencies. This result excludes 
stopwords, which are words that have little meaning for analysis. 

Figure 2 is a word cloud of words collected from 16 articles on the www.logisticsbureau.com website. 
In the word cloud, the larger a word is drawn, the more often it appears 
compared with other words. The collected articles can be analyzed for the authors' sentiments regarding the supply 
chain, freight costs, and product availability issues. The emotions of words in the text were analyzed over the whole 
corpus using the NRC word-emotion association lexicon [32]. Table 3 presents the emotion 
scores of 5 sample articles, classified into eight emotions and two sentiments using the NRC lexicon. 
The scores for each emotion and sentiment are then summed over all articles. Table 4 shows the total emotion score of all the 
extracted articles. Anger accounts for 4.99% (160), anticipation for 9.88% (317), disgust 
for 2.52% (81), fear for 7.51% (241), joy for 5.61% (180), sadness for 6.26% 
(201), surprise for 3.46% (111), trust for 16.20% (520), negative sentiment for 
13.62% (437), and positive sentiment for 29.95% (961). 
Figure 1. The top 25 most common words 
Figure 2. Word cloud 


Table 3. Emotion scores 

Num.  Article title                                                                                    Anger  Anticipation  Disgust  Fear  Joy  Sadness  Surprise  Trust  Negative  Positive 
1.    "COVID-19 Pandemic Triggers Surge in Global Food Delivery Industry"                              4      5             3        8     3    5        3         12     13        24 
2.    "Why Containerised Freight Shipping is Daunting for SMEs"                                        6      19            3        16    11   14       4         28     23        54 
3.    "The Challenge of Freight Container Utilisation and Why it Matters"                              4      18            2        6     14   10       11        34     17        60 
4.    "Container Freight Costs and Forecasting: Intrinsically Linked and Frustratingly Challenging"    9      19            5        15    9    9        6         33     23        48 
5.    "Post-COVID Ecommerce is Booming, But Logistics Issues Abound"                                   18     34            10       28    21   22       12        54     49        116 


Table 4. Total emotion scores 

Num.  Type of emotion  Total    Num.  Type of emotion  Total 
1.    Anger            160      6.    Sadness          201 
2.    Anticipation     317      7.    Surprise         111 
3.    Disgust          81       8.    Trust            520 
4.    Fear             241      9.    Negative         437 
5.    Joy              180      10.   Positive         961 
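
The percentages quoted in the text can be reproduced from the totals in Table 4. The short check below assumes each share is the category total divided by the sum of all ten categories (eight emotions plus two sentiments); that denominator is inferred from the reported figures rather than stated in the paper.

```python
# Totals copied from Table 4.
totals = {
    "anger": 160, "anticipation": 317, "disgust": 81, "fear": 241,
    "joy": 180, "sadness": 201, "surprise": 111, "trust": 520,
    "negative": 437, "positive": 961,
}

# Assumed denominator: the sum over all ten categories.
grand_total = sum(totals.values())

# Share of each category, in percent, rounded to two decimals.
shares = {name: round(100 * count / grand_total, 2)
          for name, count in totals.items()}

for name, share in shares.items():
    print(f"{name:<13}{totals[name]:>5}{share:>8.2f}%")
```

Under this assumption the grand total is 3209, and the computed shares match the reported values, for example 16.20% for trust and 29.95% for positive sentiment.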


The emotion counts are presented in bar charts in Figure 3. The visualization of 
SA using RStudio in Figure 3(a) shows that trust is the emotion most often expressed in the writing, 
followed by anticipation, fear, sadness, joy, anger, surprise, and disgust. Figure 3(b) shows the frequency of 
positive and negative words. The positive words that appear most often are 
"supply", "delivery", and "management", while the most frequent negative words are "pandemic", "demand", 
and "serve". The visualization shows that positive words are used more than negative words. 
Among the top 20 words, 6 are negative and 14 are positive. 

Figure 4 presents networks of words. Figure 4(a) presents a bigram network of frequently occurring words. 
A thick edge indicates that the words are related and co-occur more often than words joined by a thin edge. 
Furthermore, to gain knowledge about minimizing shipping costs and maintaining stock availability, we read the 
text and summarized it in a diagram. Figure 4(b) is a Sankey diagram from an analysis of the contents of the articles 
on controlling stock availability and minimizing freight costs. From the chart, the strategies to reduce 
container shipping costs are adjusting container sizes, minimizing shipping routes, combining shipments, cutting SC 
flows, and minimizing containment costs. To maintain product stock, the strategic measures to take are 
optimizing delivery routes, controlling production orders, improving SC efficiency, and diversifying the SC. 


[Figure 3: (a) bar chart titled "Type of Emotions in 'www.logisticsbureau.com'" with bars for trust, anticipation, fear, sadness, joy, anger, surprise, and disgust; (b) bar charts of negative and positive words, x-axis "Word Count"] 


Figure 3. Sentiment analysis using NRC lexicon: (a) visualization of type of emotion and (b) visualization of 
negative and positive words 

Figure 4. Knowledge discovery: (a) bigram network and (b) Sankey diagram of supply chain strategies 


4. CONCLUSION 
Returning to the question in the introduction, what do we gain from text mining a 
website? We gain insight into the emotional sentiment used in the articles. The articles build trust 
and use many positive sentiments. The words "supply", "chain", "logistics", "kpis", and 
"inventory" are used frequently throughout the articles. The terms "COVID-19", "safety stock", 
and "inventory management" often appear together. From extracting this big data, we gain 
insight into strategies to minimize container shipping costs: adjusting container sizes, minimizing shipping 
routes, combining shipments, cutting SC flows, and minimizing containment costs. To maintain product stock, 
the strategic measures to take are optimizing delivery routes, controlling production orders, SC 
efficiency, and SC diversification. Future research will seek to minimize delivery routes by examining 
the vehicle routing problem. The knowledge discovered is that logistics consultants convey 
trust and provide many insights on minimizing shipping costs and maintaining inventory during a pandemic. 